69 research outputs found

    A general modular framework for gene set enrichment analysis

    Background: Analysis of microarray and other high-throughput data on the basis of gene sets, rather than individual genes, is becoming more important in genomic studies. Correspondingly, a large number of statistical approaches for detecting gene set enrichment have been proposed, but both the interrelations and the relative performance of the various methods are still very much unclear. Results: We conduct an extensive survey of statistical approaches for gene set analysis and identify a common modular structure underlying most published methods. Based on this finding we propose a general framework for detecting gene set enrichment. This framework provides a meta-theory of gene set analysis that not only helps to gain a better understanding of the relative merits of each embedded approach but also facilitates a principled comparison and offers insights into the relative interplay of the methods. Conclusion: We use this framework to conduct a computer simulation comparing 261 different variants of gene set enrichment procedures and to analyze two experimental data sets. Based on the results we offer recommendations for best practices regarding the choice of effective procedures for gene set enrichment analysis.

    Discovering genetic interactions based on natural genetic variation

    Complex traits can be attributed to the effect of two or more genes and their interaction with each other as well as the environment. Unraveling the genetic cause of these traits, especially with regard to disease etiology, is a major goal of current research in statistical genetics. Much effort has been invested in the development of methods detecting genetic loci that are linked to variation of disease traits or intermediate molecular phenotypes such as gene expression levels. A very important aspect to be considered in the modeling of genotype-phenotype associations is that genes often interact with each other in a non-additive fashion, a phenomenon called epistasis. A special case of an epistatic interaction is an allele incompatibility, which is characterized by the inviability of all individuals carrying a certain combination of alleles at two distinct loci in the genome. The relevance and distribution of allele incompatibilities has not been investigated on a genome-wide scale in mammals. In this thesis, I propose a method for inferring allele incompatibilities that is exclusively based on DNA sequence information. We make use of genome-wide SNP data of parent-child trios and inspect 3×3 contingency tables for detecting pairs of alleles from different genomic positions that are under-represented in the population. Our method detected substantially more imbalanced allele pairs than what we got in simulations assuming no interactions. We could validate a significant number of the interactions with external data and we found that interacting loci are enriched for genes involved in developmental processes. Genes do not only interact with one another, their regulatory activity also depends on the environment or cellular context. The impact of genetic variation on gene expression will therefore also depend on cell types or on the cellular state. 
This aspect has long been neglected in the inference of genetic loci that are linked to gene expression variation (expression quantitative trait loci, eQTL). There is thus a need to develop methods for analyzing the variation of eQTL between different cell types and to assess the impact of genetic variation on expression dynamics rather than just static expression levels. In the second part of this thesis, I show that defining and detecting eQTL regulating expression dynamics is non-trivial. I propose to distinguish “static”, “conditional” and “dynamic” eQTL and suggest new strategies for mapping these eQTL classes. By using murine mRNA expression data from four stages of hematopoiesis, we demonstrate that eQTL from the above three classes yield associations with different modes of expression regulation. Intriguingly, dynamic and conditional eQTL complement one another although they are based on integration of the same expression data. We reveal substantial effects of individual genetic variation on cell-state-specific expression regulation.

    Accounting for Redundancy when Integrating Gene Interaction Databases

    In recent years, gene interaction networks have increasingly been used for the assessment and interpretation of biological measurements. Knowledge of the interaction partners of an unknown protein allows scientists to understand the complex relationships between genetic products, helps to reveal unknown biological functions and pathways, and yields a more detailed picture of an organism's complexity. Measuring all protein interactions under all relevant conditions is virtually impossible. Hence, computational methods that integrate different datasets for predicting gene interactions are needed. However, when integrating different sources one has to account for the fact that some parts of the information may be redundant, which may lead to an overestimation of the true likelihood of an interaction. Our method integrates information derived from three different databases (Bioverse, HiMAP and STRING) for predicting human gene interactions. A Bayesian approach was implemented in order to integrate the different data sources on a common quantitative scale. An important assumption of the Bayesian integration is independence of the input data (features). Our study shows that the conditional dependency cannot be ignored when combining gene interaction databases that rely on partially overlapping input data. In addition, we show how the correlation structure between the databases can be detected and we propose a linear model to correct for this bias. Benchmarking the results against two independent reference data sets shows that the integrated model outperforms the individual datasets. Our method provides an intuitive strategy for weighting the different features while accounting for their conditional dependencies.
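    To illustrate why redundancy matters in this kind of integration, here is a minimal sketch of a naive Bayesian combination, assuming each database contributes a likelihood ratio for "interaction" versus "no interaction". The function name, the prior, and the likelihood-ratio values are hypothetical and are not taken from the paper; the authors' linear correction model is not reproduced here.

```python
def posterior_probability(prior, likelihood_ratios):
    """Combine evidence under the naive independence assumption.

    prior: prior probability of an interaction (0 < prior < 1).
    likelihood_ratios: one ratio P(evidence | interaction) /
        P(evidence | no interaction) per data source (hypothetical values).
    """
    odds = prior / (1 - prior)          # convert probability to odds
    for lr in likelihood_ratios:
        odds *= lr                      # independence: ratios simply multiply
    return odds / (1 + odds)            # convert odds back to a probability


# One source supporting the interaction with a likelihood ratio of 10.
p_single = posterior_probability(0.01, [10.0])

# The same evidence reported by two overlapping databases is counted twice,
# which inflates the posterior although no new information was added.
p_double_counted = posterior_probability(0.01, [10.0, 10.0])
```

    If both databases draw on the same underlying experiment, `p_double_counted` overstates the support for the interaction, which is the kind of bias the abstract says must be corrected.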

    Systematic Detection of Epistatic Interactions Based on Allele Pair Frequencies

    Epistatic genetic interactions are key for understanding the genetic contribution to complex traits. Epistasis is always defined with respect to some trait such as growth rate or fitness. Whereas most existing epistasis screens explicitly test for a trait, it is also possible to implicitly test for fitness traits by searching for the over- or under-representation of allele pairs in a given population. Such analysis of imbalanced allele pair frequencies of distant loci has not yet been exploited on a genome-wide scale, mostly due to statistical difficulties such as the multiple testing problem. We propose a new approach called Imbalanced Allele Pair frequencies (ImAP) for inferring epistatic interactions that is exclusively based on DNA sequence information. Our approach is based on genome-wide SNP data sampled from a population with known family structure. We make use of genotype information of parent-child trios and inspect 3×3 contingency tables for detecting pairs of alleles from different genomic positions that are over- or under-represented in the population. We also developed a simulation setup that mimics the pedigree structure while simultaneously assuming independence of the markers. When applied to mouse SNP data, our method detected 168 imbalanced allele pairs, which is substantially more than in simulations assuming no interactions. We could validate a significant number of the interactions with external data, and we found that interacting loci are enriched for genes involved in developmental processes.
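    The idea of inspecting a 3×3 genotype table for two loci can be sketched with a plain Pearson chi-square test of independence. This is only an illustration under simplifying assumptions (unrelated individuals, no pedigree structure, a non-empty table) and is not the ImAP statistic, which uses parent-child trios and a dedicated simulation setup. The function name and the example counts are hypothetical.

```python
import math


def chi_square_3x3(table):
    """Pearson chi-square test of independence for a 3x3 table of counts.

    table[i][j] is the number of individuals with genotype i at locus 1
    and genotype j at locus 2 (e.g. 0 = AA, 1 = Aa, 2 = aa).
    Returns the test statistic and its p-value for 4 degrees of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(3):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    # For 4 degrees of freedom the chi-square survival function has the
    # closed form exp(-x/2) * (1 + x/2), so no external library is needed.
    p_value = math.exp(-stat / 2) * (1 + stat / 2)
    return stat, p_value


# Hypothetical counts: the last cell of the first row is empty, i.e. one
# genotype combination is never observed.
depleted = [[10, 20, 0], [20, 40, 20], [10, 20, 20]]
stat, p_value = chi_square_3x3(depleted)
```

    In a genome-wide scan such a test would be run for a very large number of locus pairs, which is where the multiple testing problem mentioned in the abstract arises.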

    An eV-scale sterile neutrino search using eight years of atmospheric muon neutrino data from the IceCube Neutrino Observatory

    The results of a 3+1 sterile neutrino search using eight years of data from the IceCube Neutrino Observatory are presented. A total of 305,735 muon neutrino events are analyzed in reconstructed energy-zenith space to test for signatures of a matter-enhanced oscillation that would occur given a sterile neutrino state with a mass-squared difference between $0.01\,\mathrm{eV}^2$ and $100\,\mathrm{eV}^2$. The best-fit point is found to be at $\sin^2(2\theta_{24}) = 0.10$ and $\Delta m_{41}^2 = 4.5\,\mathrm{eV}^2$, which is consistent with the no sterile neutrino hypothesis with a p-value of 8.0%. Comment: 11 pages, 5 figures. This letter is supported by the long-form paper "Searching for eV-scale sterile neutrinos with eight years of atmospheric neutrinos at the IceCube neutrino telescope," also appearing on arXiv. Digital data release available at: https://github.com/icecube/HE-Sterile-8year-data-releas

    Searching for eV-scale sterile neutrinos with eight years of atmospheric neutrinos at the IceCube neutrino telescope

    We report in detail on searches for eV-scale sterile neutrinos, in the context of a 3+1 model, using eight years of data from the IceCube neutrino telescope. By analyzing the reconstructed energies and zenith angles of 305,735 atmospheric $\nu_\mu$ and $\bar{\nu}_\mu$ events we construct confidence intervals in two analysis spaces: $\sin^2(2\theta_{24})$ vs. $\Delta m^2_{41}$ under the conservative assumption $\theta_{34}=0$; and $\sin^2(2\theta_{24})$ vs. $\sin^2(2\theta_{34})$ given sufficiently large $\Delta m^2_{41}$ that fast oscillation features are unresolvable. Detailed discussions of the event selection, systematic uncertainties, and fitting procedures are presented. No strong evidence for sterile neutrinos is found, and the best-fit likelihood is consistent with the no sterile neutrino hypothesis with a p-value of 8% in the first analysis space and 19% in the second. Comment: This long-form paper is a companion to the letter "An eV-scale sterile neutrino search using eight years of atmospheric muon neutrino data from the IceCube Neutrino Observatory". v2: update other experiments contours on results plot

    Results of an evaluation survey of the common teaching material for technical colleges, "New Materials IV: Composite Materials" (title page)

    The presence of a population of point sources in a data set modifies the underlying neutrino-count statistics from the Poisson distribution. This deviation can be exactly quantified using the non-Poissonian template fitting technique, and in this work we present the first application of this approach to the IceCube high-energy neutrino data set. Using this method, we search in 7 yr of IceCube data for point-source populations correlated with the disk of the Milky Way, the Fermi bubbles, the Schlegel, Finkbeiner, and Davis dust map, or with the isotropic extragalactic sky. No evidence for such a population is found in the data using this technique, and in the absence of a signal, we establish constraints on population models with source-count distribution functions that can be described by a power law with a single break. The derived limits can be interpreted in the context of many possible source classes. In order to enhance the flexibility of the results, we publish the full posterior from our analysis, which can be used to establish limits on specific population models that would contribute to the observed IceCube neutrino flux

    Increasing European Support for Neglected Infectious Disease Research

    Neglected infectious diseases (NIDs) are a persistent cause of death and disability in low-income countries. Currently available drugs and vaccines are often ineffective, costly or associated with severe side-effects. Although the scale of research on NIDs does not reflect their disease burden, there are encouraging signs that NIDs have begun to attract more political and public attention, which have translated into greater awareness and increased investments in NID research by both public and private donors. Using publicly available data, we analysed funding for NID research in the European Union's (EU's) 7th Framework Programme for Research and Technological Development (FP7), which ran from 2007 to 2013. During FP7, the EU provided €169 million for 65 NID research projects, and thereby placed itself among the top global funders of NID research. Average annual FP7 investment in NID research exceeded €24 million, triple that committed by the EU before the launch of FP7. FP7 NID projects involved research teams from 331 different institutions in 72 countries on six continents, underlining the increasingly global nature of European research activities. NID research has remained a priority in the current EU Framework Programme for research and innovation, Horizon 2020, launched in 2014. This has most notably been reflected in the second programme of the European & Developing Countries Clinical Trials Partnership (EDCTP), which provides unprecedented opportunities to advance the clinical development of new medical interventions against NIDs. Europe is thus better positioned than ever before to play a major role in the global fight against NIDs

    Expected genotype probabilities.

    Expected genotype probabilities in the offspring for each possible allele combination of the parents.
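    As a rough illustration of what such a table contains, the sketch below enumerates offspring genotype probabilities for a single biallelic locus under standard Mendelian segregation. The allele labels `A` and `a`, the genotype encoding, and the function name are assumptions made for the example; the original table may be organized differently.

```python
from fractions import Fraction
from itertools import product

GENOTYPES = ("AA", "Aa", "aa")


def offspring_probabilities(mother, father):
    """Expected offspring genotype probabilities for one biallelic locus.

    mother, father: parental genotypes, each one of "AA", "Aa", "aa".
    Assumes each parent transmits either of its two alleles with
    probability 1/2 (Mendelian segregation, no selection).
    """
    probs = {g: Fraction(0) for g in GENOTYPES}
    for m_allele, f_allele in product(mother, father):
        # Sort so that "aA" and "Aa" count as the same heterozygous genotype.
        child = "".join(sorted(m_allele + f_allele))
        probs[child] += Fraction(1, 4)
    return probs


# Print the full table for all nine parental combinations.
for mother, father in product(GENOTYPES, repeat=2):
    row = offspring_probabilities(mother, father)
    print(mother, father, {g: str(p) for g, p in row.items()})
```

    Exact fractions are used instead of floats so that each row of the table sums to exactly 1.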